Remote configuration support #488
Conversation
Force-pushed from 1618c4c to f92d3f1.
Codecov Report. Attention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #488 +/- ##
==========================================
+ Coverage 71.17% 71.83% +0.66%
==========================================
Files 220 237 +17
Lines 30000 32871 +2871
==========================================
+ Hits 21351 23612 +2261
- Misses 8649 9259 +610
Force-pushed from e16f00b to cb13b36.
#[cfg(feature = "test")]
pub mod tests {
I like the idea of having the helpers isolated in another module for the sake of clarity when included in another crate, but it seems a bit odd to have a tests module with no actual tests. If you don't plan to add tests to this module, it would probably be clearer to name it 'helpers' or something like that.
I'd rather add a submodule to tests then, like mod tests { mod helpers { ... } }, for clarity that it's test-only code.
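A minimal sketch of the layout proposed here. The helper and the path format are hypothetical; in the real crate the outer module would sit behind `#[cfg(feature = "test")]`, shown as a comment so the sketch stays self-contained:

```rust
// Sketch of the discussed structure: a tests module whose only content is a
// helpers submodule, making clear the code is test-only when pulled into
// another crate.
pub mod tests {
    // In the actual code, #[cfg(feature = "test")] would gate this module so
    // dependent crates opt in via the "test" feature.
    pub mod helpers {
        /// Hypothetical test-only helper; the name and path layout are
        /// illustrative, not the real API.
        pub fn dummy_config_path(product: &str, config_id: &str) -> String {
            format!("datadog/2/{product}/{config_id}/config")
        }
    }
}

fn main() {
    assert_eq!(
        tests::helpers::dummy_config_path("APM_TRACING", "abc"),
        "datadog/2/APM_TRACING/abc/config"
    );
}
```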
Can you add some examples to the codebase/comments so example/expected usage is shown in
Force-pushed from b89fdee to 95c7998.
@brettlangdon Implemented some helper functionality to auto-parse, diff and locally store remote config files. Please have a look at examples/remote_config_fetch.rs.
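The helper API itself isn't shown in this thread, so the following is only a hypothetical sketch of what "parse, diff and locally store" could look like; the real entry point is examples/remote_config_fetch.rs and its types and names will differ:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a parsed remote config file.
#[derive(Clone, Debug, PartialEq)]
pub struct ConfigFile {
    pub version: u64,
    pub contents: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Change {
    Added,
    Updated,
    Removed,
}

// Compute which locally stored files changed between two fetches.
pub fn diff(
    old: &HashMap<String, ConfigFile>,
    new: &HashMap<String, ConfigFile>,
) -> Vec<(String, Change)> {
    let mut changes = Vec::new();
    for (path, file) in new {
        match old.get(path) {
            None => changes.push((path.clone(), Change::Added)),
            Some(prev) if prev != file => changes.push((path.clone(), Change::Updated)),
            _ => {}
        }
    }
    for path in old.keys() {
        if !new.contains_key(path) {
            changes.push((path.clone(), Change::Removed));
        }
    }
    changes
}

fn main() {
    let mut old = HashMap::new();
    old.insert("a".to_string(), ConfigFile { version: 1, contents: b"x".to_vec() });
    old.insert("b".to_string(), ConfigFile { version: 1, contents: b"y".to_vec() });
    let mut new = HashMap::new();
    new.insert("a".to_string(), ConfigFile { version: 2, contents: b"x2".to_vec() });
    new.insert("c".to_string(), ConfigFile { version: 1, contents: b"z".to_vec() });

    let mut changes = diff(&old, &new);
    changes.sort();
    assert_eq!(
        changes,
        vec![
            ("a".to_string(), Change::Updated),
            ("b".to_string(), Change::Removed),
            ("c".to_string(), Change::Added),
        ]
    );
}
```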
Force-pushed from b82b2a8 to 7472b3a.
Please split this up into separate PRs; nobody can review this much code at once. Also, RC should be in the code owners of the RC implementation, as for other tracers.
Please consider the files in remote-config individually: parse.rs, for instance, processes remote config paths and provides an entry point to the individual products. I don't intend to split this RC beyond the point where I can do minimal end-to-end testing. EDIT: Updated CODEOWNERS accordingly.
Force-pushed from 40e1135 to 7515252.
Signed-off-by: Bob Weinand <[email protected]>
Please split the review so that it can be reviewed.
remote-config/src/fetch/fetcher.rs
Outdated
#[derive(Default)]
pub struct OpaqueState {
This state is opaque and not exposed outside of RC backends; remove it.
The implementing document says about the opaque_backend_state:
MUST contain the opaque_backend_state field extracted from targets.
This is used to make the backend stateless by saving some opaque lightweight data directly in the agent and tracers.
For tracer clients this is mainly used for accurate tracking of where configurations are sent.
I.e. that I'll have to extract and submit it unmodified with each request?
I think what Dario is trying to capture is that you need to extract the byte sequence representing the opaque_backend_state, but you are not to parse it or attempt to interpret it. Tracers never have to use the information contained within it, and the RC backend is free to store whatever it wants inside, with no contract established with tracers. As a result, simply storing it as a Vec is all you should do.
Ahhh, OK, I think I see now. client_state in here refers to the field targets.signed.custom.opaque_backend_state extracted from the signed targets. The other data in here is what you are using to track the current state to build the next fetch request. I think you should rename client_state to opaque_backend_state, and the struct itself should not be OpaqueState and instead be ClientState. On initial glance it looks like you're trying to parse the opaque state and store it in this struct, but that seems not to be the case.
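A sketch of the shape the reviewer suggests. Only the two discussed names (ClientState and opaque_backend_state) come from the thread; the remaining fields are hypothetical placeholders:

```rust
// The struct tracking fetch state is named ClientState; the raw bytes of
// targets.signed.custom.opaque_backend_state are carried as an uninterpreted
// Vec<u8> and echoed back verbatim with the next fetch request.
#[derive(Default)]
pub struct ClientState {
    /// Bytes extracted from targets.signed.custom.opaque_backend_state.
    /// Never parsed or inspected by the tracer; only stored and resent.
    pub opaque_backend_state: Vec<u8>,
    /// Hypothetical examples of state the tracer itself tracks to build the
    /// next request (the actual fields differ).
    pub targets_version: u64,
    pub last_error: Option<String>,
}

fn main() {
    let mut state = ClientState::default();
    // Pretend these bytes came out of the decoded targets metadata.
    state.opaque_backend_state = b"\x01opaque\x02".to_vec();
    // The client must not interpret them, only store and resend them.
    assert_eq!(state.opaque_backend_state, b"\x01opaque\x02".to_vec());
    assert_eq!(state.targets_version, 0);
}
```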
And avoid computing the RemoteConfigPath string with every HashMap operation, but do some Rust magic so that it will consider owned RemoteConfigPaths and unowned RemoteConfigPathRefs equivalent.
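The "Rust magic" for treating owned and borrowed keys as equivalent is commonly done with a `Borrow` implementation targeting a trait object, so `HashMap` lookups can use the borrowed form without allocating. A sketch with simplified stand-in types (the real RemoteConfigPath/RemoteConfigPathRef have a different shape):

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Common view over the owned and borrowed path representations.
trait PathLike {
    fn product(&self) -> &str;
    fn config_id(&self) -> &str;
}

#[derive(Hash, PartialEq, Eq)]
struct OwnedPath {
    product: String,
    config_id: String,
}

struct RefPath<'a> {
    product: &'a str,
    config_id: &'a str,
}

impl PathLike for OwnedPath {
    fn product(&self) -> &str { &self.product }
    fn config_id(&self) -> &str { &self.config_id }
}

impl PathLike for RefPath<'_> {
    fn product(&self) -> &str { self.product }
    fn config_id(&self) -> &str { self.config_id }
}

// Hash and Eq on the trait object make both representations hash identically
// (String hashes exactly like the &str it derefs to, matching the derive above).
impl<'a> Hash for (dyn PathLike + 'a) {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.product().hash(state);
        self.config_id().hash(state);
    }
}

impl<'a> PartialEq for (dyn PathLike + 'a) {
    fn eq(&self, other: &Self) -> bool {
        self.product() == other.product() && self.config_id() == other.config_id()
    }
}

impl<'a> Eq for (dyn PathLike + 'a) {}

// The key step: the owned key can be borrowed as the trait object, so
// HashMap::get accepts a borrowed RefPath without building an OwnedPath.
impl<'a> Borrow<dyn PathLike + 'a> for OwnedPath {
    fn borrow(&self) -> &(dyn PathLike + 'a) { self }
}

fn main() {
    let mut map: HashMap<OwnedPath, u32> = HashMap::new();
    map.insert(
        OwnedPath { product: "APM_TRACING".into(), config_id: "abc".into() },
        1,
    );
    // Lookup with borrowed parts only; no String allocation on the hot path.
    let key = RefPath { product: "APM_TRACING", config_id: "abc" };
    assert_eq!(map.get(&key as &dyn PathLike), Some(&1));
}
```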
Force-pushed from 4c4ca10 to 1f50100.
…config
Force-pushed from 733ddfa to d297ee7.
Force-pushed from d297ee7 to 37a4fea.
The core loop looks good to me, aside from a few minor comments (and one question that might actually have uncovered a gap in our system tests that RC should address).
I do want to go over the multi-target setup more, though. Is there a design doc on how you are handling this setup at a high level? I think that would really aid my review, and also any future maintenance that must be done on this client. If there isn't such a doc, we can meet and talk about this, but I think we want to get something like this on file in case there is any need to collaborate with RC on debugging, since this is a new concept being introduced.
remote-config/src/fetch/fetcher.rs
Outdated
let computed_hash = hasher(decoded.as_slice());
if hash != computed_hash {
    warn!("Computed hash of file {computed_hash} did not match remote config targets file hash {hash} for path {path}: file: {}", String::from_utf8_lossy(decoded.as_slice()));
    continue;
I thought we had a system test for this, which you hooked up today and passed, but it looks like if a computed hash doesn't match, the loop just continues, when an invalid file should fail the whole payload with an error. Just for my own comfort, where is that handled?
I can make the whole thing an error too, sure.
I'm looking over the system tests and I actually don't think we have a test for this 👀.
In theory this should never happen unless somebody is manipulating data in the middle. If it were happening due to an RC bug, we would have a catastrophic error in our system (it's likely to be caught well before it gets to the tracer).
Since we don't have a system test, we can't really hold tracers to any implementation at this time. I think I should add one, and we should get convergence from tracers. My recommendation is to follow the RFC and the behavior we will eventually enforce via tests: if one hash is bad, fail the entire update so that this gets reported up the chain quickly. (We have logs for client failures and monitoring for major issues like this.)
Promoted to error.
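A sketch of the agreed behavior: one bad hash fails the entire update rather than skipping the single file. The error type and the toy hasher are illustrative; the real client verifies the digests declared in the targets metadata:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, PartialEq)]
enum FetchError {
    HashMismatch { path: String, expected: String, computed: String },
}

// Toy stand-in for the real digest function from the targets metadata.
fn hasher(data: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    format!("{:016x}", h.finish())
}

// Verify every (path, decoded contents, expected hash) entry; the first
// mismatch returns Err, discarding the whole payload instead of continuing.
fn verify_update(files: &[(String, Vec<u8>, String)]) -> Result<(), FetchError> {
    for (path, decoded, expected) in files {
        let computed = hasher(decoded);
        if &computed != expected {
            return Err(FetchError::HashMismatch {
                path: path.clone(),
                expected: expected.clone(),
                computed,
            });
        }
    }
    Ok(())
}

fn main() {
    let good = vec![("datadog/2/P/id/config".to_string(), b"ok".to_vec(), hasher(b"ok"))];
    assert!(verify_update(&good).is_ok());

    // Tampered contents: the whole update fails, not just this file.
    let bad = vec![("datadog/2/P/id/config".to_string(), b"tampered".to_vec(), hasher(b"ok"))];
    assert!(verify_update(&bad).is_err());
}
```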
remote-config/src/fetch/single.rs
Outdated
fetcher: ConfigFetcher::new(sink, Arc::new(ConfigFetcherState::new(invariants))),
target: Arc::new(target),
runtime_id,
config_id: uuid::Uuid::new_v4().to_string(),
What does this config_id represent? A given client doesn't know about its config IDs ahead of time, and they can constantly change as configurations are created or deleted by users.
It looks like this is used as the client_id in the fetch logic, so I think you want to rename this for clarity.
Renamed to client_id.
remote-config/src/fetch/shared.rs
Outdated
pub client_id: String,
cancellation: CancellationToken,
/// Interval used if the remote server does not specify a refetch interval, in nanoseconds.
pub default_interval: AtomicU64,
As mentioned in the review of the primary RFC loop, that info is for agents and not tracer clients, so this can remain constant. (I believe most tracers poll at 1s.)
Removed.
remote-config/src/fetch/fetcher.rs
Outdated
if endpoint.api_key.is_some() {
    if parts.scheme.is_none() {
        parts.scheme = Some(Scheme::HTTPS);
        parts.authority = Some(
            format!("{}.{}", subdomain, parts.authority.unwrap())
                .parse()
                .unwrap(),
        );
    }
    parts.path_and_query = Some(PathAndQuery::from_static("/api/v0.1/configurations"));
I don't think there's ever a use case where a tracer client should be talking to /api/v0.1/configurations, as that's the RC backend. The RC backend is not designed to talk directly to tracers, only to what we call the "core remote config service" running in the agent.
It's sort of unused; it just makes the code do something valid at this place. The actual run loop has if self.state.endpoint.api_key.is_some() { return Err() }.
I don't intend to use it without thorough discussion with the RC team.
We have no plans to allow tracers to talk directly to the remote config backend, and the payload is fundamentally different so you wouldn't even be able to handle the payload with the current code. This needs to be removed.
Whatever, it's not doing anything, but removed.
Context for review: the Remote Config platform does not own and maintain tracer clients, but we do own the contract between the agent and tracer clients as defined by the [Tracer Client RFC](https://docs.google.com/document/d/1u_G7TOr8wJX0dOM_zUDKuRJgxoJU_hVTd5SeaMucQUs/edit?usp=sharing). My review focused on whether the code conformed with the RFC.
I have reviewed in detail the single-fetcher use case for the RFC RC owns (and requested and received confirmation of it passing our system tests for the RFC; thanks for doing that! 🥳). I also reviewed the multi-fetcher use case at a high level, primarily how it layers everything together on top of the aforementioned single fetcher.
Force-pushed from 993e00d to da37f2e.
Also including Dynamic Configuration, but it's a small part, just 100 lines of code, describing the structures it contains.
As to more specific changes to current code:
Note: AsBytes in ddcommon currently requires &'a AsBytes<'a>, while only &AsBytes<'a> (with an independent lifetime) is actually required.
General notes: